Abstract: Given a frame in $\mathbb{C}^n$ which satisfies
a form of the uncertainty principle (as introduced by
Candes and Tao), it is shown how to quickly convert the
frame representation of every vector into a more robust
Kashin's representation whose coefficients all have the
smallest possible dynamic range $O(1/\sqrt{n})$. The
information tends to spread evenly among these
coefficients. As a consequence, Kashin's representations
have great power for reducing errors in their
coefficients, including coefficient losses and
distortions.

Index Terms — Frame representations, 
Kashin's representations, restricted isometries, 
uncertainty principles 

I. Introduction 

Quantization is a representation of continu- 
ous structures with discrete structures. Digital 
signal processing, which has revolutionized 
the modern treatment of still images, video 
and audio, employs quantization as a conver- 
sion step from the analog to digital world. A 
survey of the state-of-the-art of quantization
prior to 1998 as well as an outline of its numerous
applications can be found in the paper
[22] by Gray and Neuhoff. For more recent
developments, we refer the reader to [15] and 
references therein. 

In this paper, we are interested in robust 
vector encoding and vector quantization. Orthogonal
expansions give a classical way to encode vectors in
finite dimensions. One first chooses a convenient
orthonormal basis $(u_i)_{i=1}^n$ of $\mathbb{C}^n$.
Then one encodes a vector $x \in \mathbb{C}^n$ by the
coefficients $(a_i)_{i=1}^n$ of its orthogonal expansion

$$x = \sum_{i=1}^{n} a_i u_i, \quad \text{where } a_i = \langle x, u_i \rangle.$$

An example of this situation is the discrete 
Fourier transform. At the next step, one quantizes
the coefficients $a_i$ using a convenient scalar
quantizer (for example, a uniform quantizer with a
fixed number of levels).

A drawback of orthogonal expansions is that the
information contained in the vector $x$ may get
distributed unevenly among the coefficients $a_i$,
which makes this encoding vulnerable to distortions and
losses of the coefficients. For example, if $x$ is
collinear with the first basis vector $u_1$ then all the
coefficients except $a_1$ are zero. If the first
coefficient $a_1$ is lost (for example due to
transmission failure) then we cannot reconstruct the
vector $x$ even approximately.

A popular way to improve the stability of vector
encoding is to use redundant systems of vectors
$(u_i)_{i=1}^N$ in $\mathbb{C}^n$ called tight frames.
These are generalizations of orthonormal bases in the
sense that every vector $x \in \mathbb{C}^n$ can still
be represented as

$$x = \sum_{i=1}^{N} a_i u_i, \quad \text{where } a_i = \langle x, u_i \rangle, \qquad \text{(I.1)}$$

but for $N > n$ frames are clearly linearly dependent
systems of vectors. These dependencies cause the
information contained in $x$ to spread among several
frame coefficients $a_i$, which improves the stability
of such representations with respect to errors (for
example losses and quantization errors), see e.g. [11],
[21], [10] and references therein.

The idea of spreading the information evenly among the
coefficients is developed in the present paper, and in
a sense it is pushed to its limit. As in the previous
approaches, we shall start with a frame
$(u_i)_{i=1}^N$. But instead of the standard frame
expansions (I.1) we will be looking at expansions
$x = \sum_{i=1}^N a_i u_i$ with coefficients having the
smallest possible dynamic range $|a_i| = O(1/\sqrt{N})$.
This ensures that the information contained in $x$ is
spread among the coefficients nearly uniformly. We call
such representations of the vector $x$ Kashin's
representations. In this paper we demonstrate the
following:

(a) there exist frames $(u_i)_{i=1}^N$ with redundancy
factor $N/n$ as close as one likes to 1, and such that
every vector $x \in \mathbb{C}^n$ has a Kashin's
representation;

(b) such frames are those that satisfy a form of the
uncertainty principle. More precisely, their matrices
satisfy a weak version of the restricted isometry
property introduced by Candes and Tao [8]. In
particular, many natural random frames have this
property;

(c) there is a fast algorithm which converts the frame
representation (I.1) into a Kashin's representation
of $x$.

Kashin's representations withstand errors 
in their coefficients in a very strong way - 
the representation error gets bounded by the 
average, rather than the sum, of the errors 
in the coefficients. These errors may be of 
arbitrary nature, including distortion (e.g. due 
to scalar quantization) and losses (e.g. due to 
transmission failures). 

The article is organized as follows. Section II
introduces Kashin's representations, discusses their
relation to convex geometry (Euclidean projections of
the cube) and explains how one can use Kashin's
representations for vector quantization. In Section III
we discuss the uncertainty principle for matrices and
frames. Theorems 3.5 and 3.9 state that for frames that
satisfy the uncertainty principle, every frame
representation can be replaced by Kashin's
representation. A robust algorithm is given to quickly
convert frame representations into Kashin's
representations. In Section IV
we discuss families of matrices and frames 
that satisfy the uncertainty principle. These 
include: random orthogonal matrices, random 
partial Fourier matrices, and a large family of 
matrices with independent entries (subgaus- 
sian matrices), in particular random Gaussian 
and Bernoulli matrices. 

II. Kashin's Representations

A. Frame representations

A sequence $(u_i)_{i=1}^N \subset \mathbb{C}^n$ is
called a tight frame if it satisfies Parseval's identity

$$\|x\|_2 = \Big( \sum_{i=1}^{N} |\langle x, u_i \rangle|^2 \Big)^{1/2} \quad \text{for all } x \in \mathbb{C}^n. \qquad \text{(II.1)}$$

This definition differs by a constant normalization
factor from one which is often used in the literature,
but (II.1) will be more convenient for us to work with.

A frame $(u_i)_{i=1}^N \subset \mathbb{C}^n$ can be
identified with the $n \times N$ frame matrix $U$ whose
columns are $u_i$. The following properties are easily
seen to be equivalent:

1) $(u_i)_{i=1}^N$ is a tight frame for $\mathbb{C}^n$;

2) every vector $x \in \mathbb{C}^n$ admits the frame
representation (I.1);

3) the rows of the frame matrix $U$ are orthonormal;

4) $u_i = P h_i$ for some orthonormal basis
$(h_i)_{i=1}^N$ of $\mathbb{C}^N$, where $P$ is the
orthogonal projection in $\mathbb{C}^N$ onto
$\mathbb{C}^n$.

When $N > n$, the tight frames are linearly dependent
systems, so various coefficients of the frame
representation may carry common information about the
vector $x \in \mathbb{C}^n$. This makes frames withstand
noise in coefficients better than orthonormal bases,
see [11], [21], [10]. However, using the frame
representation (I.1) may not always be the best way to
use the frame redundancy. Some coefficients $a_i$ may
be much bigger than others, and thus carry more
information about $x$. In order to help information
spread in the most uniform way, one should try to make
all coefficients of the same magnitude. Such
representations will be called Kashin's representations.
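
As a concrete illustration of the tight frame properties and of the uneven spread of frame coefficients, the following minimal sketch builds a small real tight frame and checks Parseval's identity and the frame representation (I.1) numerically. The frame used here (the identity and normalized Hadamard bases of $\mathbb{R}^4$, both scaled by $1/\sqrt{2}$, so $n = 4$, $N = 8$) and all function names are illustrative assumptions of ours, not constructions taken from this paper.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def hadamard_identity_frame():
    """Hypothetical example frame: columns of [I | H] / sqrt(2) in R^4 (N = 8)."""
    h = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
    s = 1 / math.sqrt(2)
    identity_part = [[s if r == c else 0.0 for r in range(4)] for c in range(4)]
    hadamard_part = [[s * h[r][c] / 2 for r in range(4)] for c in range(4)]
    return identity_part + hadamard_part

def frame_coefficients(x, frame):
    """Coefficients a_i = <x, u_i> of the frame representation (real case)."""
    return [dot(x, u) for u in frame]

def synthesize(coeffs, frame):
    """Return sum_i coeffs[i] * frame[i]."""
    return [sum(c * u[k] for c, u in zip(coeffs, frame)) for k in range(len(frame[0]))]

frame = hadamard_identity_frame()
x = [1.0, 0.0, 0.0, 0.0]          # collinear with one frame vector
a = frame_coefficients(x, frame)
x_rec = synthesize(a, frame)
energy = sum(c * c for c in a)    # should equal ||x||_2^2 by Parseval's identity
print(a, x_rec, energy)
```

For this input the largest frame coefficient equals $\sqrt{n/N}\,\|x\|_2$ while several others vanish, which is the uneven spread that Kashin's representations are meant to avoid.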

B. Kashin's representations 

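Consider a sequence $(u_i)_{i=1}^N \subset \mathbb{C}^n$.
We say that the expansion

$$x = \sum_{i=1}^{N} a_i u_i, \qquad \max_i |a_i| \le \frac{K}{\sqrt{N}} \|x\|_2 \qquad \text{(II.2)}$$

is a Kashin's representation with level $K$ of the
vector $x \in \mathbb{C}^n$.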

Kashin's representations produce the smallest possible
dynamic range of the coefficients, which is smaller
than the dynamic range of the frame representations.
This is the content of the following simple observation:

Observation 2.1 (Optimality): Let $(u_i)_{i=1}^N$ be a
tight frame in $\mathbb{C}^n$. Then:

(a) There exists a vector $x \in \mathbb{C}^n$ for which
the coefficients $a_i = \langle x, u_i \rangle$ of the
frame representation (I.1) satisfy

$$\max_i |a_i| \ge \sqrt{\frac{n}{N}} \, \|x\|_2.$$

(b) For every vector $x \in \mathbb{C}^n$, every
representation of the form $x = \sum_{i=1}^N a_i u_i$
satisfies

$$\max_i |a_i| \ge \frac{1}{\sqrt{N}} \, \|x\|_2.$$

Proof. (a) Since the tight frame satisfies
$\sum_{i=1}^N \|u_i\|_2^2 = n$, one has
$\max_i \|u_i\|_2 \ge \sqrt{n/N}$. Taking $x$ to be a
frame vector of maximal norm, part (a) follows.

(b) The correspondence between tight frames and
orthonormal bases (property 4) above) yields Bessel's
inequality $\|x\|_2 \le \big( \sum_{i=1}^N |a_i|^2 \big)^{1/2}$,
from which part (b) follows. ■

Not every tight frame admits Kashin's representations
with constant level $K$; this is clear if one considers
an orthonormal basis in $\mathbb{C}^n$ repeated
$\sim N/n$ times and properly normalized. Nevertheless,
some natural classes of frames do have this property.

We start with the following existence result. 

Theorem 2.2 (Existence): There exist tight frames in
$\mathbb{C}^n$ with arbitrarily small redundancy
$\lambda = N/n > 1$, and such that every vector
$x \in \mathbb{C}^n$ has a Kashin's representation with
level $K$ that depends on $\lambda$ only (not on $n$
or $N$).

Proof. This statement is in essence a reformulation of
the classical result from geometric functional analysis
due to Kashin [28] (with an optimal dependence of $K$
on $\lambda$ given later by Garnaev and Gluskin [17]).
To see this, we shall look at Kashin's representations
from the geometric viewpoint. Let
$Q^N = \{x : \|x\|_\infty \le 1\}$ and
$B^n = \{x : \|x\|_2 \le 1\}$ stand for the unit cube
and the unit Euclidean ball in $\mathbb{C}^N$ and
$\mathbb{C}^n$ respectively. The observation below
follows directly from the definition of Kashin's
representations.

Observation 2.3 (Kashin's representations and
projections of the cube): Consider a tight frame
$(u_i)_{i=1}^N$ in $\mathbb{C}^n$ and a number $K > 0$.
The following are equivalent:

(i) Every vector $x \in \mathbb{C}^n$ has a Kashin's
representation of level $K$ with respect to the system
$(u_i)_{i=1}^N$;

(ii) The $n \times N$ matrix $U$ whose columns are
$u_i$ satisfies

$$B^n \subseteq \frac{K}{\sqrt{N}} \, U(Q^N). \qquad \text{(II.3)}$$

Inclusion (II.3) yields an equivalence

$$B^n \subseteq \frac{K}{\sqrt{N}} \, U(Q^N) \subseteq K B^n; \qquad \text{(II.4)}$$

the second inclusion holds trivially. Since the rows of
the frame matrix $U$ are orthonormal, the operator
$U : \mathbb{C}^N \to \mathbb{C}^n$ is unitarily
equivalent to an orthogonal projection. We thus may say
that $U$ realizes a Euclidean projection of the cube.
We refer the reader to [34] Section 6 for a more
thorough discussion of this topic.

Kashin's theorem [28] states that there exists an
orthogonal projection of the unit cube in
$\mathbb{C}^N$ onto a subspace of dimension $n$, which
is equivalent to a Euclidean ball, and the coefficient
$K$ depends on the redundancy $\lambda = N/n$ only. In
other words, there exists an $n \times N$ matrix $U$
whose rows are orthonormal and which satisfies (II.4).

The first inclusion in (II.4) means that the columns
$u_i$ of the matrix $U$ form a system for which every
vector has a Kashin's representation. Since the rows of
$U$ are orthonormal, $(u_i)$ is a tight frame. This
proves Theorem 2.2. ■

In geometric functional analysis, many classes of
matrices $U$ are known to realize Euclidean projections
of the cube as in (II.4). We discuss them in more
detail in Section IV-A. In fact we will see that a
random matrix $U$ with orthonormal rows picked with
respect to a rotationally invariant distribution
satisfies (II.4) with high probability.

Remark. Since the level $K$ of Kashin's representation
depends on the redundancy only, this representation
becomes especially efficient in high dimensions, when
the factor $\sqrt{n}$ in the expression for the dynamic
range of the frame expansion overpowers the value of
$K$ (whose optimal dependence on the redundancy is
given in [17]). Therefore we are interested mainly in
frames of low redundancy, just in order to avoid too
large volumes of information to be transmitted.

C. Stability, vector quantization 

Kashin's representations have maximal power to reduce
errors in the coefficients. Indeed, consider a tight
frame $(u_i)_{i=1}^N$, but instead of using frame
representations we shall use Kashin's representations
with some constant level $K = O(1)$. So we represent a
vector $x \in \mathbb{C}^n$, $\|x\|_2 \le 1$, with its
Kashin's coefficients
$(a_1, \ldots, a_N) \in \mathbb{C}^N$,
$|a_i| \le K/\sqrt{N}$. Assume these coefficients are
damaged (due to quantization, losses, flips of bits,
etc.) and we only know noisy coefficients
$(\hat a_1, \ldots, \hat a_N) \in \mathbb{C}^N$. When we
try to reconstruct $x$ from these coefficients as
$\hat x = \sum_{i=1}^N \hat a_i u_i$, the accuracy of
this reconstruction is

$$\|x - \hat x\|_2 = \Big\| \sum_{i=1}^{N} (a_i - \hat a_i) u_i \Big\|_2 \le \Big( \sum_{i=1}^{N} |a_i - \hat a_i|^2 \Big)^{1/2}. \qquad \text{(II.5)}$$

Combined with the fact that the coefficients $a_i$ have
the dynamic range $O(1/\sqrt{N})$, this yields greater
robustness of Kashin's representations with respect to
noise, and in particular to quantization errors.
Suppose we need to quantize a vector
$x \in \mathbb{C}^n$. We may do this by quantizing each
coefficient $a_i$ separately by performing a uniform
scalar quantization of the dynamic range
$[-K/\sqrt{N}, K/\sqrt{N}]$ with, say, $L$ levels. The
quantization error for each coefficient is thus
$|a_i - \hat a_i| \le K/(L\sqrt{N})$. By (II.5), this
produces the overall quantization error

$$\|x - \hat x\|_2 \le K/L = O(1/L).$$

Similar quantization of frame representations (I.1)
would only give the bound

$$\|x - \hat x\|_2 \le \sqrt{n}/L$$

because its dynamic range is $\sqrt{n}$ times larger
than that of Kashin's representations (by
Observation 2.1).
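
The two bounds above can be checked numerically. The sketch below is only an illustration under stated assumptions: the tight frame (identity and Hadamard bases of $\mathbb{R}^4$ scaled by $1/\sqrt{2}$), the coefficient values, the midpoint quantizer, and the values $K = 1.5$, $L = 16$ are all hypothetical choices of ours. Since (II.5) holds for arbitrary coefficients of a tight frame, the coefficients are simply chosen in the range $[-K/\sqrt{N}, K/\sqrt{N}]$ rather than computed as a Kashin's representation.

```python
import math

def hadamard_identity_frame():
    """Hypothetical example frame: columns of [I | H] / sqrt(2) in R^4 (N = 8)."""
    h = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
    s = 1 / math.sqrt(2)
    identity_part = [[s if r == c else 0.0 for r in range(4)] for c in range(4)]
    hadamard_part = [[s * h[r][c] / 2 for r in range(4)] for c in range(4)]
    return identity_part + hadamard_part

def synthesize(coeffs, frame):
    """Return sum_i coeffs[i] * frame[i]."""
    return [sum(c * u[k] for c, u in zip(coeffs, frame)) for k in range(len(frame[0]))]

def quantize(a, bound, levels):
    """Uniform quantizer on [-bound, bound]: snap to the midpoint of one of `levels` cells."""
    step = 2 * bound / levels
    cell = min(levels - 1, max(0, int((a + bound) / step)))
    return -bound + (cell + 0.5) * step

def l2(v):
    return math.sqrt(sum(t * t for t in v))

frame = hadamard_identity_frame()
N, K, L = len(frame), 1.5, 16
bound = K / math.sqrt(N)
coeffs = [bound * math.sin(1.0 + i) for i in range(N)]  # any values with |a_i| <= K/sqrt(N)
quantized = [quantize(a, bound, L) for a in coeffs]
x, x_hat = synthesize(coeffs, frame), synthesize(quantized, frame)
recon_err = l2([p - q for p, q in zip(x, x_hat)])          # left side of (II.5)
coeff_err = l2([p - q for p, q in zip(coeffs, quantized)])  # right side of (II.5)
print(recon_err, coeff_err, K / L)
```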

Kashin's decompositions also withstand arbitrary errors
made to a small fraction of the coefficients $a_i$.
These may include losses of coefficients and arbitrary
flips of bits. Suppose at most $\delta N$ coefficients
$(a_1, \ldots, a_N)$ are damaged in an arbitrary way,
which results in coefficients
$(\hat a_1, \ldots, \hat a_N)$. Since all
$|a_i| \le K/\sqrt{N}$, we can assume (by truncation)
that all $|\hat a_i| \le K/\sqrt{N}$. When we
reconstruct $x$ from these damaged coefficients (as
before), the accuracy of this reconstruction can be
estimated using (II.5) as

$$\|x - \hat x\|_2 \le 2K\sqrt{\delta} = O(\sqrt{\delta}).$$

Thus the reconstruction error is small whenever the
(relative) number of damaged coefficients $\delta$ is
small.

By Theorem 2.2 the maximal error reduction effect is
achieved using frames with only a constant redundancy;
in fact any redundancy factor $\lambda = N/n > 1$ has
the error reduction power of maximal possible order.
This is in contrast with traditional methods, in which
increasing redundancy of the frame gradually reduces
the representation error.
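
The bound $2K\sqrt{\delta}$ for damaged coefficients can be illustrated in the same way. In the sketch below the frame, the coefficient values, the level $K = 1.5$, and the two kinds of damage (one wildly corrupted value, one lost value replaced by zero) are hypothetical choices made only for illustration.

```python
import math

def hadamard_identity_frame():
    """Hypothetical example frame: columns of [I | H] / sqrt(2) in R^4 (N = 8)."""
    h = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
    s = 1 / math.sqrt(2)
    identity_part = [[s if r == c else 0.0 for r in range(4)] for c in range(4)]
    hadamard_part = [[s * h[r][c] / 2 for r in range(4)] for c in range(4)]
    return identity_part + hadamard_part

def synthesize(coeffs, frame):
    """Return sum_i coeffs[i] * frame[i]."""
    return [sum(c * u[k] for c, u in zip(coeffs, frame)) for k in range(len(frame[0]))]

frame = hadamard_identity_frame()
N, K = len(frame), 1.5
bound = K / math.sqrt(N)
coeffs = [bound * math.cos(i) for i in range(N)]  # any values with |a_i| <= K/sqrt(N)

received = list(coeffs)
received[0] = -1e6   # arbitrarily corrupted coefficient
received[5] = 0.0    # lost coefficient
delta = 2 / N        # fraction of damaged coefficients

# Truncate the received values to the known dynamic range before reconstructing.
received = [max(-bound, min(bound, a)) for a in received]

x, x_hat = synthesize(coeffs, frame), synthesize(received, frame)
err = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, x_hat)))
print(err, 2 * K * math.sqrt(delta))
```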

III. Computing Kashin's Representations

Computing the coefficients $a_i$ of Kashin's
representation (II.2) of a given vector $x$ can be
described as a linear feasibility problem, which can be
solved in (weakly) polynomial time using linear
programming methods.

In this paper, we take a different approach to
computing Kashin's representations, by establishing
their connection with the uncertainty principle. This
will have several advantages over the linear
programming approach:

1) Whenever a frame $(u_i)$ satisfies the uncertainty
principle, one can effectively transform every frame
representation into a Kashin's representation. This
will take $O(\log N)$ multiplications of the matrix
$U$ by a vector.

2) The uncertainty principle will thus be a guarantee
that a given frame $(u_i)$ yields a Kashin's
representation for every vector. This can help to
identify frames that yield Kashin's representations.

3) The algorithm to transform frame representations
into Kashin's representations is simple, natural, and
robust. It has the potential to be implemented on
analog devices. Followed by some robust scalar
quantization of coefficients (such as one-bit
$\beta$-quantization [13], [14]), this algorithm may be
used for robust one-bit vector quantization schemes for
analog-to-digital conversion.

A. The uncertainty principle 

The classical uncertainty principle says that a
function and its Fourier transform cannot be
simultaneously well-localized. We refer the reader to
the fundamental monograph [25] for a historical survey
and also for numerous realizations of this heuristic
rule. In particular, a variant of the uncertainty
principle due to Donoho and Stark [16] states that if
$f \in L_2(\mathbb{R})$ is "almost concentrated" on a
measurable set $T$ while its Fourier transform
$\hat f$ is "almost concentrated" on a measurable set
$\Omega$, then the product of measures $|T||\Omega|$
admits a natural lower bound. Donoho and Stark proposed
applications of this principle for signal recovery [16].

For signals on discrete domains no satisfactory version
of the uncertainty principle was known until recently.
For the discrete Fourier transform in $\mathbb{C}^N$
the uncertainty principle states that
$|\mathrm{supp}(x)| \, |\mathrm{supp}(\hat x)| \ge N$
for all $x \in \mathbb{C}^N$ (see [16]). This
inequality is sharp: both terms in this product can be
of order $\sqrt{N}$.

In papers by Candes, Romberg and Tao [4], 
[7], [3] and by Rudelson and Vershynin [35], 
[36], a much stronger discrete uncertainty 
principle was established for random sets of 
size proportional to N. Moreover, one of these 
sets (say support of the signal in frequency 
domain) can be arbitrary (non-random), and 
the other (random support in time domain) can 
be almost the whole domain. The following 
result is a consequence of [35], [36]: 

Theorem 3.1 (uncertainty principle): Let
$N = (1 + \mu)n$ for some integer $n$ and
$\mu \in (0,1)$. Consider a random subset $\Omega$ of
$\{0, \ldots, N-1\}$ of average cardinality $n$, which
is obtained from independent random $\{0,1\}$-valued
variables $\delta_0, \ldots, \delta_{N-1}$ with
$\mathbb{E}\delta_i = n/N$ as
$\Omega := \{i : \delta_i = 1\}$. Then $\Omega$
satisfies the following with high probability.
For every $z \in \mathbb{C}^N$,

$$\mathrm{supp}(\hat z) \subseteq \Omega \quad \text{implies} \quad |\mathrm{supp}(z)| > \delta N,$$

where 5 = cfi^/log'^N and c > is an 
absolute constant. 

Moreover, for every $x \in \mathbb{C}^N$ with
$|\mathrm{supp}(x)| \le \delta N$, one has

$$\|\hat x \cdot 1_\Omega\|_2 \le (1 - c\mu) \|x\|_2, \qquad \text{(III.1)}$$

where $1_\Omega$ denotes the indicator function of
$\Omega$.

The first, qualitative, part of the theorem easily
follows from the second, quantitative part with
$z = x$. If $\mathrm{supp}(\hat x) \subseteq \Omega$
and $|\mathrm{supp}(x)| \le \delta N$ then, by the
second part,
$\|\hat x\|_2 = \|\hat x \cdot 1_\Omega\|_2 < \|x\|_2$,
which would contradict Parseval's equality.

We can regard inequality (III.1) as a property of the
partial Fourier matrix $U$, which consists of the rows
of the DFT (discrete Fourier transform) matrix $\Phi$
indexed by the random set $\Omega$. Then (III.1) says
that $\|Ux\|_2 \le (1 - c\mu)\|x\|_2$ for all vectors
$x \in \mathbb{C}^N$ such that
$|\mathrm{supp}(x)| \le \delta N$.

Now we can abstract from the harmonic 
analysis in question and introduce a general 
uncertainty principle (UP) as a property of 
matrices. 

Definition 3.2 (UP for matrices): An $n \times N$
matrix $U$ satisfies the uncertainty principle with
parameters $\eta, \delta \in (0,1)$ if, for
$x \in \mathbb{C}^N$,

$$|\mathrm{supp}(x)| \le \delta N \quad \text{implies} \quad \|Ux\|_2 \le \eta \|x\|_2. \qquad \text{(III.2)}$$

We will only use the uncertainty principle 
for matrices U with orthonormal (or almost 
orthonormal) rows, in which case it is always 
a nontrivial property. 
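
For very small matrices the best constant $\eta$ in (III.2) can be computed by brute force. The sketch below does this for supports of size at most 2, where the largest eigenvalue of the $2 \times 2$ Gram matrix has a closed form. The example matrix (columns of $[I \mid H]/\sqrt{2}$ in $\mathbb{R}^4$, with $H$ the normalized Hadamard matrix) and the function names are hypothetical choices for illustration; the brute-force search is exponential in the support size and is not the method used in this paper.

```python
import math
from itertools import combinations

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def hadamard_identity_frame():
    """Hypothetical example frame: columns of [I | H] / sqrt(2) in R^4 (N = 8)."""
    h = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
    s = 1 / math.sqrt(2)
    identity_part = [[s if r == c else 0.0 for r in range(4)] for c in range(4)]
    hadamard_part = [[s * h[r][c] / 2 for r in range(4)] for c in range(4)]
    return identity_part + hadamard_part

def up_constant_pairs(columns):
    """Smallest eta with ||Ux||_2 <= eta ||x||_2 for all x supported on at most 2 coordinates."""
    worst = max(dot(u, u) for u in columns)  # supports of size 1
    for u, v in combinations(columns, 2):
        a, b, c = dot(u, u), dot(v, v), dot(u, v)
        # Largest eigenvalue of the Gram matrix [[a, c], [c, b]].
        top = (a + b + math.sqrt((a - b) ** 2 + 4 * c * c)) / 2
        worst = max(worst, top)
    return math.sqrt(worst)

columns = hadamard_identity_frame()
eta = up_constant_pairs(columns)
delta = 2 / len(columns)
print(eta, delta)
```

For this example the worst pair consists of one identity column and one Hadamard column, which gives $\eta = \sqrt{3}/2$ with $\delta = 1/4$.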

A related uniform uncertainty principle (UUP) was
introduced by Candes and Tao in the context of sparse
recovery problems [8]. The UUP with parameters
$\varepsilon, \delta \in (0,1)$ states that there
exists $\lambda > 0$ such that, for
$x \in \mathbb{C}^N$, the condition
$|\mathrm{supp}(x)| \le \delta N$ implies

$$\lambda (1 - \varepsilon) \|x\|_2 \le \|Ux\|_2 \le \lambda (1 + \varepsilon) \|x\|_2.$$

See also [5], [6] for more refined versions. Known also
as the Restricted Isometry Condition, the UUP was shown
in [8] to be a guarantee that one can efficiently solve
underdetermined systems of linear equations $Ux = b$
under the assumption that the solution is sparse,
$|\mathrm{supp}(x)| \le \delta N$. This is a part of
the fast developing area of Compressed Sensing [9].



The uncertainty principle is a weaker assumption (thus
easier to verify) than the UUP:

Observation 3.3: For matrices with orthonormal rows,
the UUP with parameters $\varepsilon, \delta$ implies
the uncertainty principle with parameters
$\eta = \frac{1+\varepsilon}{1-\varepsilon} \sqrt{\frac{n}{N}}$,
$\delta$.

Proof. Since the columns $u_i$ of the matrix $U$
satisfy $\sum_{i=1}^N \|u_i\|_2^2 = n$, there exists a
column with norm $\|u_i\|_2 \le \sqrt{n/N}$. This
column is the image of some 1-sparse unit vector
$e_i$, i.e. $u_i = U e_i$ where
$e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ with 1 in the
$i$-th place. Using the UUP for $x = e_i$, we obtain

$$\lambda (1 - \varepsilon) \le \|u_i\|_2 \le \sqrt{\frac{n}{N}}.$$

Hence $\lambda \le \frac{1}{1-\varepsilon} \sqrt{\frac{n}{N}}$.
In view of this estimate, the upper bound in the UUP
reads as follows:

$$|\mathrm{supp}(x)| \le \delta N \quad \text{implies} \quad \|Ux\|_2 \le \frac{1+\varepsilon}{1-\varepsilon} \sqrt{\frac{n}{N}} \, \|x\|_2.$$

This is what we wanted to prove. ■

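The uncertainty principle can be reformulated as a
property of systems of vectors $(u_i)_{i=1}^N$, which
form the columns of the matrix $U$. We will use it for
tight (or almost tight) frames, in which case it is a
nontrivial property:

Definition 3.4 (UP for frames): A system of vectors
$(u_i)_{i=1}^N$ in $\mathbb{C}^n$ satisfies the
uncertainty principle with parameters
$\eta, \delta \in (0,1)$ if

$$\Big\| \sum_{i \in \Omega} a_i u_i \Big\|_2 \le \eta \Big( \sum_{i \in \Omega} |a_i|^2 \Big)^{1/2} \qquad \text{(III.3)}$$

for every subset $\Omega \subseteq \{1, 2, \ldots, N\}$
with $|\Omega| \le \delta N$.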



B. Converting frame representations into Kashin's
representations

For every tight frame that satisfies the uncertainty
principle, one can convert frame representations into
Kashin's representations.

The conversion procedure is natural and fast. We
truncate the coefficients of the frame representation
(I.1) of $x$ at level $M = \|x\|_2/\sqrt{\delta N}$ in
the hope of achieving a Kashin's representation with
level $K = 1/\sqrt{\delta}$. However, the truncated
representation may sum up to a vector $x^{(1)}$
different from $x$. So we consider the residual
$x - x^{(1)}$, compute its frame representation and
again truncate its coefficients, now at a lower level
$\eta M$. We continue this process of expansion,
truncation and reconstruction, each time reducing the
truncation level by the factor of $\eta$.

Using the uncertainty principle, we will be able to
show that the norm of the residual reduces by the
factor of $\eta$ at each iteration. So we can compute
Kashin's representations of level
$K = K(\eta, \delta)$ with accuracy $\varepsilon$ in
$O(\log(1/\varepsilon))$ iterations. Our analysis of
this algorithm will yield:

Theorem 3.5 (frame to Kashin conversion): Let
$(u_i)_{i=1}^N$ be a tight frame in $\mathbb{C}^n$
which satisfies the uncertainty principle with
parameters $\eta, \delta$. Then each vector
$x \in \mathbb{C}^n$ admits a Kashin's representation
of level

$$K = (1 - \eta)^{-1} \delta^{-1/2}.$$

In order to prove this result, we introduce and study
the truncation operator for frame representations.
Given a number $M > 0$, the one-dimensional truncation
at level $M$ is defined for
$z \in \mathbb{C} \setminus \{0\}$ as

$$t_M(z) = \frac{z}{|z|} \cdot \min\{|z|, M\}, \qquad \text{(III.4)}$$

and $t_M(0) = 0$.
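
A minimal sketch of the one-dimensional truncation (III.4) for complex inputs is given below; the function name `truncate` is our own. The truncation keeps the phase of $z$ and caps its modulus at the level $M$.

```python
def truncate(z, level):
    """One-dimensional truncation t_M(z) = (z/|z|) * min(|z|, M), with t_M(0) = 0."""
    z = complex(z)
    r = abs(z)
    if r <= level:
        # Small values (including z = 0) are left unchanged.
        return z
    # Large values keep their phase and are rescaled to modulus `level`.
    return z * (level / r)

print(truncate(3 + 4j, 1.0))  # modulus 5 is cut down to modulus 1
print(truncate(0.3j, 1.0))    # already below the level, unchanged
```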

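Consider a frame $(u_i)_{i=1}^N$ satisfying the
assumptions of the theorem. For every
$x \in \mathbb{C}^n$, we consider the frame
representation

$$x = \sum_{i=1}^{N} b_i u_i, \quad \text{where } b_i = \langle x, u_i \rangle,$$

and define the truncation operator on $\mathbb{C}^n$ as

$$Tx = \sum_{i=1}^{N} \hat b_i u_i, \quad \text{where } \hat b_i = t_M(b_i) \text{ and } M = \|x\|_2/\sqrt{\delta N}. \qquad \text{(III.5)}$$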

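The uncertainty principle helps us to bound the
residual of the truncation:

Lemma 3.6 (Truncation): In the above notation, for
every vector $x \in \mathbb{C}^n$ we have

$$\|x - Tx\|_2 \le \eta \|x\|_2. \qquad \text{(III.6)}$$

Proof. Let $x \in \mathbb{C}^n$. Consider the subset
$\Omega \subseteq \{1, \ldots, N\}$ defined as

$$\Omega = \{i : b_i \ne \hat b_i\} = \{i : |b_i| > M\}.$$

By the definition of a tight frame, we have

$$\|x\|_2^2 = \sum_{i=1}^{N} |b_i|^2 \ge |\Omega| M^2,$$

thus

$$|\Omega| \le \|x\|_2^2 / M^2 = \delta N.$$

Using the uncertainty principle, we can estimate the
norm of the residual
$x - Tx = \sum_{i \in \Omega} (b_i - \hat b_i) u_i$ as

$$\|x - Tx\|_2 \le \eta \Big( \sum_{i \in \Omega} |b_i - \hat b_i|^2 \Big)^{1/2} \le \eta \Big( \sum_{i=1}^{N} |b_i|^2 \Big)^{1/2} = \eta \|x\|_2.$$

This completes the proof. ■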

Proof of Theorem 3.5. Given $x \in \mathbb{C}^n$, for
$k = 1, 2, \ldots$ we define the vectors

$$x^{(0)} := x, \qquad x^{(k)} := x^{(k-1)} - T x^{(k-1)}.$$

Then, for each $r = 0, 1, 2, \ldots$ we have

$$x = \sum_{k=0}^{r-1} T x^{(k)} + x^{(r)}.$$

It follows from Lemma 3.6 by induction that
$\|x^{(k)}\|_2 \le \eta^k \|x\|_2$, thus

$$x = \sum_{k=0}^{\infty} T x^{(k)}.$$

Furthermore, by the definition of the truncation
operator $T$, each vector $T x^{(k)}$ has an expansion
in the system $(u_i)_{i=1}^N$ with coefficients bounded
by
$\|x^{(k)}\|_2/\sqrt{\delta N} \le \eta^k \|x\|_2/\sqrt{\delta N}$.
Summing up these expansions for $k = 0, 1, 2, \ldots$,
we obtain an expansion of $x$ with coefficients bounded
by $(1 - \eta)^{-1} \|x\|_2/\sqrt{\delta N}$. In other
words, $x$ admits a Kashin's representation with level
$K = (1 - \eta)^{-1} \delta^{-1/2}$. This completes the
proof. ■

The proof yields an algorithm to compute Kashin's
representations:

Algorithm to compute Kashin's representations

Input:

• A tight frame $(u_i)_{i=1}^N$ in $\mathbb{C}^n$ which
satisfies the uncertainty principle with parameters
$\eta, \delta \in (0,1)$.

• A vector $x \in \mathbb{C}^n$ and a number of
iterations $r$.

Output: Kashin's decomposition of $x$ with level
$K = (1 - \eta)^{-1} \delta^{-1/2}$ and with accuracy
$\eta^r \|x\|_2$. Namely, the algorithm finds
coefficients $a_1, \ldots, a_N$ such that

$$\Big\| x - \sum_{i=1}^{N} a_i u_i \Big\|_2 \le \eta^r \|x\|_2, \qquad \max_i |a_i| \le \frac{K}{\sqrt{N}} \|x\|_2. \qquad \text{(III.7)}$$

Initialize the coefficients and the truncation level:

$$a_i \leftarrow 0, \quad i = 1, \ldots, N; \qquad M \leftarrow \frac{\|x\|_2}{\sqrt{\delta N}}.$$

Repeat the following $r$ times:

• Compute the frame representation of $x$ and truncate
at level $M$:

$$b_i \leftarrow \langle x, u_i \rangle, \quad \hat b_i \leftarrow t_M(b_i), \quad i = 1, \ldots, N.$$

• Reconstruct and compute the error:

$$Tx \leftarrow \sum_{i=1}^{N} \hat b_i u_i; \qquad x \leftarrow x - Tx.$$

• Update Kashin's coefficients and the truncation
level:

$$a_i \leftarrow a_i + \hat b_i, \quad i = 1, \ldots, N; \qquad M \leftarrow \eta M.$$
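
The steps above can be sketched in a few lines for real vectors, where the truncation $t_M$ reduces to clipping. This is only an illustration under stated assumptions: the example frame (columns of $[I \mid H]/\sqrt{2}$ in $\mathbb{R}^4$, $N = 8$), the function names, and the parameters $\delta = 3/8$, $\eta = (1/2 + \sqrt{2}/4)^{1/2}$ are ours. The value of $\eta$ was worked out by hand for this particular frame as the largest norm of a submatrix formed by at most three of its columns; it is not a value taken from this paper.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def hadamard_identity_frame():
    """Hypothetical example frame: columns of [I | H] / sqrt(2) in R^4 (N = 8)."""
    h = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
    s = 1 / math.sqrt(2)
    identity_part = [[s if r == c else 0.0 for r in range(4)] for c in range(4)]
    hadamard_part = [[s * h[r][c] / 2 for r in range(4)] for c in range(4)]
    return identity_part + hadamard_part

def kashin_coefficients(x, frame, eta, delta, iterations):
    """Truncation iteration for real vectors; returns (coefficients, final residual)."""
    n_vectors = len(frame)
    a = [0.0] * n_vectors
    level = math.sqrt(dot(x, x)) / math.sqrt(delta * n_vectors)
    residual = list(x)
    for _ in range(iterations):
        b = [dot(residual, u) for u in frame]                # frame coefficients
        b_hat = [max(-level, min(level, c)) for c in b]      # truncation at level M
        tx = [sum(c * u[k] for c, u in zip(b_hat, frame)) for k in range(len(x))]
        residual = [p - q for p, q in zip(residual, tx)]     # x <- x - Tx
        a = [p + q for p, q in zip(a, b_hat)]                # accumulate coefficients
        level *= eta                                         # M <- eta * M
    return a, residual

frame = hadamard_identity_frame()
eta, delta, r = math.sqrt(0.5 + math.sqrt(2) / 4), 3 / 8, 20   # assumed UP parameters
x = [1.0, 0.0, 0.0, 0.0]
a, residual = kashin_coefficients(x, frame, eta, delta, r)
K = 1 / ((1 - eta) * math.sqrt(delta))
frame_max = max(abs(dot(x, u)) for u in frame)
kashin_max = max(abs(c) for c in a)
res_norm = math.sqrt(dot(residual, residual))
print(frame_max, kashin_max, res_norm)
```

In this small example the largest coefficient drops from about 0.71 (frame representation) to about 0.64, while the coefficients still reconstruct $x$ up to the final residual.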



Remark. (Redistributing information). One 
can view this algorithm as a method of redis- 
tributing information among the coefficients. 
At each iteration, it "shaves off" excessive 
information from the few biggest coefficients 
(using truncation) and redistributes this excess 
more evenly. This process is continued until 
all coefficients have a fair share of the infor- 
mation. 

Remark. (Computing exact Kashin's representations).
With a minor modification, this algorithm can compute
an exact Kashin's representation after
$r = O(\log N)$ iterations. We just do not need to
truncate the coefficients $b_i$ during the last
iteration.

Indeed, for such $r$, the error factor satisfies
$\eta^r \le \frac{K}{\sqrt{N}}$. Thus, during the
$r$-th iteration the frame coefficients $b_i$ are all
bounded by $\frac{K}{\sqrt{N}} \|x\|_2$, where $x$ is
the initial input vector. So the $b_i$ are already
sufficiently small, and we will not apply the
truncation at the last iteration. This yields an exact
Kashin's representation of $x$ with level $K' = 2K$.

Remark. (Robustness). The algorithm above 
is robust in the sense of [12]. 

Specifically, the truncation operation (III.4) may be
impossible to realize on a physical signal exactly,
because it is expensive to build an analog scheme that
has an exact phase transition at the truncation level
$|z| = M$.
A robust algorithm should not rely on any 
assumptions on exact phase transitions of the 
operations it uses. Scalar quantizers that are 
robust in this sense were first constructed by 
Daubechies and DeVore in [12] and further 
developed in [24], [13], [14]. 

Our algorithm is also robust in the following sense:
the exact truncation $t_M$ can be replaced by any
approximate truncation. Such an approximate truncation
at level 1 can be any function
$t(z) : \mathbb{C} \to \mathbb{C}$ which satisfies, for
some $\nu, \tau \in (0,1)$:

$$|z - t(z)| \le \begin{cases} \nu |z| & \text{if } |z| \le \tau, \\ |z| & \text{for all } z, \end{cases} \qquad \text{(III.8)}$$

and $|t(z)| \le 1$ for all $z$.

The approximate truncation at level $M$ is defined as
$t_M(z) := M \, t(z/M)$. An analysis similar to that
above yields:

Theorem 3.7 (Approximate truncation): The above
algorithm remains valid if one replaces exact
truncation by any approximate one and also adjusts the
parameters: the level $M$ should be replaced with
$M' = \tau^{-1} M$, the parameter $\eta$ should be
replaced with $\eta' = \sqrt{\eta^2 + \nu^2}$, and
finally the level $K$ is replaced with
$K' = \tau^{-1} (1 - \eta')^{-1} \delta^{-1/2}$,
provided that $\eta' < 1$.

Moreover, the approximate truncation can
be different each time it is called by the
algorithm, provided that it satisfies (III.8).
This facilitates the algorithm implementation 
on analog devices. In particular, one can use 
this algorithm to build robust vector quantizers 
for analog-to-digital conversion. 

Remark. (Almost tight frames). Similar results also
hold for frames that are almost, but not exactly,
tight. This is important for natural classes of frames,
such as random Gaussian and subgaussian frames (see
Theorem 4.6).

Definition 3.8: For $\varepsilon \in (0,1)$, a sequence
$(u_i)_{i=1}^N \subset \mathbb{C}^n$ is called an
$\varepsilon$-tight frame if

$$(1 - \varepsilon) \|x\|_2 \le \Big( \sum_{i=1}^{N} |\langle x, u_i \rangle|^2 \Big)^{1/2} \le (1 + \varepsilon) \|x\|_2 \quad \text{for all } x \in \mathbb{C}^n. \qquad \text{(III.9)}$$

An analysis similar to that above yields:

Theorem 3.9: Let $(u_i)_{i=1}^N \subset \mathbb{C}^n$
be an $\varepsilon$-tight frame which satisfies the
uncertainty principle with parameters $\eta$ and
$\delta$. Then Theorem 3.5 and the algorithm above are
valid for $M$ replaced with
$M' = \sqrt{1 + \varepsilon} \, M$ and $\eta$ replaced
with $\eta' = \sqrt{1 + \varepsilon} \, \eta + \varepsilon$,
provided that $\eta' < 1$.

Remark. (History). The idea behind Theorem 3.5 is
certainly not new. Gluskin [19] suggested using
properties that involve only $\|\cdot\|_2$ norms (like
our uncertainty principle) to deduce results on
Euclidean sections of $\ell_1^N$ (which by duality is
equivalent to Euclidean projections (II.3) of a cube).
A similar idea was essentially used by Talagrand in his
work on the $\Lambda_1$ problem [38].

The algorithm to compute Kashin's representations
resembles the Chaining Algorithm of [18], which also
detects a few biggest coefficients and iterates on the
residual, but it serves to find all big coefficients
rather than to spread them out.

IV. Matrices and Frames that Satisfy the Uncertainty
Principle

In this section, we give examples of matrices
(equivalently, frames) that satisfy the uncertainty
principle. By Observation 2.3 such $n \times N$
matrices $U$ realize Euclidean projections of the cube
(II.3). Equivalently, these frames $(u_i)_{i=1}^N$ (the
columns of $U$) yield quickly computable Kashin's
representations for every vector $x \in \mathbb{C}^n$.

A. Matrices known to realize Euclidean pro- 
jections of the cube 

Much attention has been paid to Euclidean projections
of the cube (II.3) in geometric functional analysis.
Results in the literature are usually stated in the
dual form, about $n$-dimensional Euclidean subspaces of
$\ell_1^N$.

Kashin proved (II.3) for a random orthogonal
$n \times N$ matrix $U$ (formed by the first $n$ rows
of a random matrix in $O(N)$), with $N = \lambda n$ for
arbitrary $\lambda > 1$, and with exponential
probability ([28], see also [34] Section 6). The level
$K$ in (II.3) depends only on $\lambda$; an optimal
dependence was given later by Garnaev and
Gluskin [17].

A similar result holds for
$U = \frac{1}{\sqrt{N}} \Phi$, where $\Phi$ is a random
Bernoulli matrix, which means that the entries of
$\Phi$ are $\pm 1$ symmetric independent random
variables. Schechtman [37] first proved this with
$N = O(n)$, and in [31] this result is generalized for
$N = \lambda n$ with arbitrary $\lambda > 1$. The
dependence of $K$ on $\lambda$ was improved recently in
[1]. In fact, these results hold for a quite general
class of subgaussian matrices (which includes Bernoulli
and Gaussian random matrices).

It is unknown whether Kashin's theorem holds for
partial Fourier matrices; this conjecture is known as
the $\Lambda_1$ problem. Consider the Discrete Fourier
Transform in $\mathbb{C}^N$, where $N = O(n)$, given by
the orthogonal $N \times N$ matrix $\Phi$. It is
unknown whether there exists a submatrix $U$ which
consists of some $n$ rows of $\Phi$ and such that it
realizes a Euclidean projection of the cube in the
sense of (II.3).

In the positive direction, a partial result due to
Bourgain, later reproved by Talagrand with a general
method [38], states that a random partial Fourier
matrix $U$ satisfies (II.4) with high probability for
$N = O(n)$ and



K = 0{y^\og{N) loglog(A^)). It was re- 
cently proved in [23] that Bourgain's result 
holds for arbitrarily small redundancy, that is 
for $N = \lambda n$ with arbitrary $\lambda > 1$,
however at the cost of a slightly worse logarithmic
factor in $K$. A similar result can also be deduced
from Theorem 4.3 below (along with Theorem 3.5 and
Observation 2.3), which is a consequence of the
uncertainty principle in [35], [36].

No explicit constructions of such matrices $U$ are
known. However, there exist small space constructions
that use a small number of random bits [2], [26], [27].

B. Random orthogonal matrices 

We consider random $n \times N$ matrices whose rows are
orthonormal. Such matrices can be obtained by selecting
the first $n$ rows of orthogonal $N \times N$ matrices.
Indeed, denote by $O(N)$ the space of all orthogonal
$N \times N$ matrices with the normalized Haar measure.
Then

$$O(n \times N) := \{P_n V : V \in O(N)\}, \qquad \text{(IV.1)}$$

where $P_n : \mathbb{C}^N \to \mathbb{C}^n$ is the
orthogonal projection onto the first $n$ coordinates.
The probability measure on $O(n \times N)$ is induced
by the Haar measure on $O(N)$.

Theorem 4.1: (UP for random orthogonal 
matrices) 

Let $\mu > 0$ and $N = (1 + \mu)n$. Then, with
probability at least $1 - 2\exp(-c\mu^2 n)$, a random
orthogonal $n \times N$ matrix $U$ satisfies the
uncertainty principle with the parameters
4 log(l/Ai) 
where c > is an absolute constant. 

Remark. The assumption on $\mu$ is not essential; only the expressions for $\eta$ and $\delta$ will look different. We are most interested in small values of $\mu$, when the redundancy is small.

The proof of Theorem 4.1 uses a standard scheme in geometric functional analysis: the concentration inequality on the sphere followed by an $\varepsilon$-net argument. Denote by $S^{N-1}$ and $\sigma_{N-1}$ the unit Euclidean sphere in $\mathbb{C}^N$ and the normalized Lebesgue measure on $S^{N-1}$.

Lemma 4.2: For arbitrary $t > 0$, $x \in S^{N-1}$, we have

$$\mathbb{P}\Big\{\|Ux\|_2 > (1 + t)\sqrt{\frac{n}{N}}\Big\} \le 2\exp(-c_1 t^2 n),$$

where $c_1 > 0$ is an absolute constant.
Proof. We use representation (IV.1) and also the fact that $z = Vx$ is a random vector uniformly distributed on $S^{N-1}$. Thus $Ux$ is distributed identically with $P_n z$. We also have

$$E := \int_{S^{N-1}} \|P_n z\|_2 \, d\sigma_{N-1}(z) \le \Big(\int_{S^{N-1}} \|P_n z\|_2^2 \, d\sigma_{N-1}(z)\Big)^{1/2} = \sqrt{\frac{n}{N}}.$$

The map $z \mapsto \|P_n z\|_2$ is a 1-Lipschitz function on $S^{N-1}$. The concentration inequality (see e.g. [29] Section 1.3) then implies that this function is well concentrated about its average value $E$:

$$\mathbb{P}\{\|Ux\|_2 > E + u\} \le \sigma_{N-1}\big(z \in S^{N-1} : \big|\|P_n z\|_2 - E\big| > u\big) \le 2\exp(-cu^2 N).$$

Choosing $u = t\sqrt{n/N}$ completes the proof. ■
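The concentration in Lemma 4.2 is easy to observe numerically. The sketch below is an illustration only, assuming real scalars: it samples $z$ uniformly on the sphere by normalizing a Gaussian vector and records how often $\|P_n z\|_2$ exceeds $(1+t)\sqrt{n/N}$. The function names and the values of $n$, $N$, $t$ are our own choices.

```python
import math
import random


def projected_norm(n, N, rng):
    """Norm of the first n coordinates of a uniform random point on S^{N-1}."""
    g = [rng.gauss(0.0, 1.0) for _ in range(N)]
    total = math.sqrt(sum(a * a for a in g))
    head = math.sqrt(sum(a * a for a in g[:n]))
    return head / total


def tail_frequency(n, N, t, trials, rng):
    """Empirical frequency of the event ||P_n z||_2 > (1 + t) sqrt(n / N)."""
    threshold = (1.0 + t) * math.sqrt(n / N)
    hits = sum(projected_norm(n, N, rng) > threshold for _ in range(trials))
    return hits / trials


rng = random.Random(0)
for t in (0.05, 0.1, 0.2):
    # The frequency should drop quickly as t grows.
    print(t, tail_frequency(100, 200, t, 2000, rng))
```

The empirical frequencies only illustrate the shape of the bound; they say nothing about the value of the absolute constant $c_1$.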

Proof of Theorem 4.1. Assume that $\eta$ and $\delta$ satisfy the assumptions (IV.2). We have to prove that (III.2) holds with probability at least $1 - 2\exp(-c\mu^2 n)$.

Consider the set

$$S := \{x \in S^{N-1} : |\mathrm{supp}(x)| \le \delta N\}.$$

We have

$$S = \bigcup_{|I| \le \delta N} S_I;$$

here the union is taken over all subsets $I$ of $\{1, \ldots, N\}$ of cardinality at most $\delta N$, and $S_I = S^{N-1} \cap \mathbb{C}^I$ is the set of all unit vectors whose supports lie in $I$. Let $\varepsilon > 0$. For each $I$, we can find an $\varepsilon$-net of $S_I$ in the Euclidean norm, and of cardinality at most $(3/\varepsilon)^{\delta N}$ (see e.g. [34] Lemma 4.16). Taking the union over all sets $I$ with $|I| = \lceil \delta N \rceil$, we conclude by Stirling's bound on the binomial coefficients that there exists an $\varepsilon$-net $\mathcal{N}$ of $S$ of cardinality

$$|\mathcal{N}| \le \Big(\frac{3e}{\varepsilon\delta}\Big)^{\delta N}.$$
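Then using Lemma 4.2 we obtain

$$\mathbb{P}\Big\{\exists y \in \mathcal{N} : \|Uy\|_2 > (1 + t)\sqrt{\frac{n}{N}}\Big\} \le |\mathcal{N}| \cdot 2\exp(-c_1 t^2 n).$$

Every $x \in S$ can be approximated by some $y \in \mathcal{N}$ within $\varepsilon$ in the Euclidean norm, and since $U$ has norm one, we have

$$\|Ux\|_2 \le \|Uy\|_2 + \|U(x - y)\|_2 \le \|Uy\|_2 + \varepsilon.$$

Therefore

$$\mathbb{P}\Big\{\exists x \in S : \|Ux\|_2 > (1 + t)\sqrt{\frac{n}{N}} + \varepsilon\Big\} \le |\mathcal{N}| \cdot 2\exp(-c_1 t^2 n). \tag{IV.3}$$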



It now remains to choose parameters appropriately. Let $t = \mu/5$ and let $\varepsilon$ be a small constant multiple of $\mu$. Then since $N/n = 1 + \mu$ and by the assumption on $\eta$ in (IV.2), we have

$$(1 + t)\sqrt{\frac{n}{N}} + \varepsilon \le \eta.$$

Also, we can estimate the probability in (IV.3) as

$$|\mathcal{N}| \cdot 2\exp(-c_1 t^2 n) \le \Big(\frac{24e}{\delta\mu}\Big)^{\delta N} \cdot 2\exp(-c_2 \mu^2 n) \le 2\exp\Big[\big(2\delta\log(24e/(\delta\mu)) - c_2\mu^2\big)n\Big], \tag{IV.4}$$

where $c_2 = c_1/25$. By our choice of $\delta$, the right hand side of (IV.4) is bounded by $2\exp(-c\mu^2 n)$, where $c > 0$ is an absolute constant. We conclude that

$$\mathbb{P}\{\exists x \in S : \|Ux\|_2 > \eta\} \le 2\exp(-c\mu^2 n).$$

This completes the proof. ■

C. Random partial Fourier matrices 

An important class of matrices that satisfy the uncertainty principle can be obtained by selecting $n$ random rows of an arbitrary orthogonal $N \times N$ matrix $\Phi$ whose entries are $O(N^{-1/2})$. Here $n$ can be an arbitrarily big fraction of $N$, so the uncertainty principle will hold for almost square random submatrices. This class includes random partial Fourier matrices; multiplication by such a matrix corresponds to sampling $n$ random frequencies of a signal.

More precisely, we select rows of $\Phi$ using random selectors $s_1, \ldots, s_N$, which are independent Bernoulli random variables taking value 1 each with probability $n/N$. The selected rows will be indexed by a random subset $\Omega = \{i : s_i = 1\}$ of $\{1, \ldots, N\}$, whose average cardinality is $n$.
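The selection rule above can be sketched in a few lines. This is an illustration under our own assumptions: the unitary discrete Fourier matrix plays the role of $\Phi$ (its entries have modulus exactly $N^{-1/2}$), indices run from 0 rather than 1, and the names `random_row_subset` and `partial_dft` are hypothetical.

```python
import cmath
import math
import random


def random_row_subset(n, N, rng):
    """Indices i whose selector s_i equals 1, with P(s_i = 1) = n / N independently."""
    return [i for i in range(N) if rng.random() < n / N]


def partial_dft(omega, N):
    """Rows of the unitary N x N DFT matrix indexed by omega."""
    scale = 1.0 / math.sqrt(N)
    return [[scale * cmath.exp(-2j * math.pi * i * k / N) for k in range(N)]
            for i in omega]


rng = random.Random(0)
omega = random_row_subset(8, 16, rng)
U = partial_dft(omega, 16)
print(len(omega), "rows selected out of 16")
```

Note that the number of selected rows is itself random; only its average equals $n$, exactly as in the text.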

Theorem 4.3: (UP for random partial Fourier matrices)

Let $\Phi$ be an orthogonal $N \times N$ matrix with uniformly bounded entries: $|\Phi_{ij}| \le \alpha N^{-1/2}$ for some constant $\alpha$ and all $i, j$. Let $n$ be an integer such that $N = (1 + \mu)n$ for some $\mu \in (0, 1]$. Then for each $p \in (0, 1)$ there exists a constant $c = c(p, \alpha) > 0$ such that the following holds.

Let $U$ be a submatrix of $\Phi$ formed by selecting a random subset of the rows of average cardinality $n$. Then, with probability at least $1 - p$, the matrix $U$ satisfies the uncertainty principle with parameters

$$\eta = 1 - \frac{\mu}{4}, \qquad \delta = \frac{c\mu^2}{\log^4 N}.$$

Theorem 4.3 is a direct consequence of a slightly stronger result established in [7] and improved in [35], [36]. For an operator $U$ on a Euclidean space, $\|U\|$ will denote its operator norm.

Theorem 4.4: (UUP for partial Fourier matrices [35], [36])

Assume the hypothesis of Theorem 4.3 is satisfied. Then there exists a constant $C = C(\alpha) > 0$ such that the following holds. Let $r > 0$ and $\varepsilon \in (0, 1)$ be such that

$$n \ge C\Big(\frac{r\log N}{\varepsilon^2}\Big)\log\Big(\frac{r\log N}{\varepsilon^2}\Big)\log^2 r.$$

Then the random submatrix $U$ satisfies:

$$\mathbb{E}\sup_{|T| \le r}\Big\|\mathrm{id}_T - \frac{N}{n}U_T^* U_T\Big\| \le \varepsilon. \tag{IV.5}$$

Here the supremum is taken over all subsets $T$ of $\{1, \ldots, N\}$ with at most $r$ elements, $U_T$ denotes the submatrix of $U$ that consists of the columns of $U$ indexed in $T$, and $\mathrm{id}_T$ denotes the identity on $\mathbb{C}^T$.



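Proof of Theorem 4.3. Observe that for a linear operator $A$ on $\mathbb{C}^n$ one has

$$\|\mathrm{id} - A^*A\| = \sup_{x \in \mathbb{C}^n,\ \|x\|_2 = 1}\big|\langle(\mathrm{id} - A^*A)x, x\rangle\big| = \sup_{x \in \mathbb{C}^n,\ \|x\|_2 = 1}\big|\|Ax\|_2^2 - \|x\|_2^2\big|. \tag{IV.6}$$

We use this observation for $A = \sqrt{N/n}\,U_T$. Since $U_T x = Ux$ whenever $\mathrm{supp}(x) \subseteq T$, we obtain from (IV.5)

$$\mathbb{E}\sup_{|\mathrm{supp}(x)| \le r,\ \|x\|_2 = 1}\Big|\frac{N}{n}\|Ux\|_2^2 - 1\Big| \le \varepsilon.$$

By Markov's inequality, with probability at least $1 - p$ the random matrix $U$ satisfies:

$$\Big|\frac{N}{n}\|Ux\|_2^2 - 1\Big| \le \frac{\varepsilon}{p} \quad \text{for all } x \in \mathbb{C}^N,\ |\mathrm{supp}(x)| \le r,\ \|x\|_2 = 1.$$

In particular, for such $U$, one has:

$$\|Ux\|_2 \le \sqrt{\Big(1 + \frac{\varepsilon}{p}\Big)\frac{n}{N}}\ \|x\|_2 \quad \text{for all } x \in \mathbb{C}^N,\ |\mathrm{supp}(x)| \le r.$$

Then, if we set $\varepsilon = cp\mu$ for an appropriate absolute constant $c > 0$, we can bound the factor

$$\sqrt{\frac{1 + \varepsilon/p}{1 + \mu}} \le 1 - \frac{\mu}{4} = \eta.$$

This proves the uncertainty principle (III.2) with $\delta = r/N$. To estimate $\delta$ we note that the condition on $n$ in Theorem 4.4 is satisfied if

$$r \le \frac{c_1\varepsilon^2 N}{\log^4 N}, \tag{IV.7}$$

where $c_1 = c_1(\alpha) > 0$. Since we have set $\varepsilon = cp\mu$, condition (IV.7) is equivalent to

$$\delta \le \frac{cp^2\mu^2}{\log^4 N},$$

where $c = c(\alpha) > 0$. This completes the proof of Theorem 4.3. ■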



Remarks.

1. (Computing in almost linear time). The Fourier matrices can be used to compute Kashin's representations in $\mathbb{C}^n$ in time almost linear in $n$. Indeed, let for example $N = 2n$. The columns of the $n \times N$ partial Fourier matrix form a tight frame in $\mathbb{C}^n$. By Theorem 4.3 and Section III-B we can convert a frame representation of every vector $x \in \mathbb{C}^n$ into a Kashin's representation with level $K = O(\log^2 n)$ in time $O(n\log^2 n)$. (Recall that the algorithm makes $O(\log n)$ multiplications by a partial Fourier matrix, and each multiplication can be done using the fast Fourier transform in time $O(n\log n)$.)

2. The constant $c = c(\alpha, p)$ depends polynomially on $\alpha$ and polylogarithmically on $p$. The polynomial dependence on $\alpha$ is straightforward from the proof of Theorem 4.4 in [35], [36]. The proof above gives a polynomial dependence on the probability $p$. To improve it to a polylogarithmic dependence, one can use an exponential tail estimate, proved in [36] Theorem 3.9, instead of the expectation estimate (IV.5).

3. We stated Theorem 4.3 in the range $\mu \in (0, 1]$ which is most interesting for us (where the redundancy factor is small). A similar result holds for arbitrary $\mu > 0$.
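To make Remark 1 concrete, here is a hedged sketch of an iterated truncation scheme in the spirit of Section III-B, as we understand it: compute frame coefficients of the current residual, truncate them at a geometrically decreasing level, and subtract what the truncated coefficients reconstruct. Everything specific is an assumption made for illustration: the function names, the fixed row subset, the values $\eta = 0.9$ and $\delta = 0.25$ (which are not verified uncertainty principle parameters for this matrix), and the naive matrix products that stand in for the fast Fourier transform.

```python
import cmath
import math


def dft_rows(omega, N):
    """Rows of the unitary N x N DFT matrix indexed by omega."""
    scale = 1.0 / math.sqrt(N)
    return [[scale * cmath.exp(-2j * math.pi * i * k / N) for k in range(N)]
            for i in omega]


def analysis(U, x):
    """Frame coefficients b = U^* x, a vector of length N."""
    N = len(U[0])
    return [sum(U[j][k].conjugate() * x[j] for j in range(len(U)))
            for k in range(N)]


def synthesis(U, b):
    """The vector U b, of length n."""
    return [sum(row[k] * b[k] for k in range(len(b))) for row in U]


def truncate(b, level):
    """Shrink each coefficient to modulus at most `level`, keeping its phase."""
    return [z if abs(z) <= level else z * (level / abs(z)) for z in b]


def kashin_sketch(U, x, eta, delta, steps):
    """Iterated truncation; returns (coefficients a, residual x - U a)."""
    N = len(U[0])
    norm_x = math.sqrt(sum(abs(v) ** 2 for v in x))
    a = [0j] * N
    residual = list(x)
    for k in range(steps):
        level = (eta ** k) * norm_x / math.sqrt(delta * N)
        b_hat = truncate(analysis(U, residual), level)
        a = [p + q for p, q in zip(a, b_hat)]
        back = synthesis(U, b_hat)
        residual = [r - s for r, s in zip(residual, back)]
    return a, residual


omega = [0, 1, 2, 3, 5, 8, 11, 13]          # assumed row subset, n = 8, N = 16
U = dft_rows(omega, 16)
x = [complex(j + 1, -j) for j in range(8)]
a, residual = kashin_sketch(U, x, eta=0.9, delta=0.25, steps=30)
print(max(abs(z) for z in a), math.sqrt(sum(abs(r) ** 2 for r in residual)))
```

By construction $x = Ua + \text{residual}$ at every step, and every coefficient satisfies $|a_i| \le \|x\|_2 / ((1-\eta)\sqrt{\delta N})$. Whether the residual actually decays at rate $\eta$ depends on $U$ satisfying the uncertainty principle with the assumed parameters, which this sketch does not check.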

D. Subgaussian random matrices. 

A large family of matrices with indepen- 
dent random entries satisfies the uncertainty 
principle. 

Definition 4.5: A random variable $\varphi$ is called subgaussian with parameter $\beta$ if

$$\mathbb{P}\{|\varphi| > u\} \le \exp(1 - u^2/\beta^2) \quad \text{for all } u > 0.$$

Examples of subgaussian random variables include Gaussian $N(0, 1)$ random variables and bounded random variables.
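Definition 4.5 can be checked directly for the two examples just mentioned. The sketch below compares exact tail probabilities with the bound $\exp(1 - u^2/\beta^2)$ on a grid of values of $u$; the choices $\beta = \sqrt{2}$ for a standard Gaussian and $\beta = 1$ for a random sign are ours, picked because they are easy to verify, and are not claimed to be optimal.

```python
import math


def subgaussian_bound(u, beta):
    """Right-hand side of Definition 4.5: exp(1 - u^2 / beta^2)."""
    return math.exp(1.0 - (u / beta) ** 2)


def gaussian_tail(u):
    """Exact P(|g| > u) for a standard Gaussian g."""
    return math.erfc(u / math.sqrt(2.0))


def sign_tail(u):
    """Exact P(|s| > u) for a random sign s taking values +1 and -1."""
    return 1.0 if u < 1.0 else 0.0


grid = [0.1 * k for k in range(1, 81)]
print(all(gaussian_tail(u) <= subgaussian_bound(u, math.sqrt(2.0)) for u in grid))
print(all(sign_tail(u) <= subgaussian_bound(u, 1.0) for u in grid))
```

A grid check is of course not a proof; for the Gaussian case the inequality follows from the classical bound $\mathbb{P}\{|g| > u\} \le \exp(-u^2/2)$.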

Theorem 4.6: (UP for random subgaussian matrices)

Let $\Phi$ be a $n \times N$ matrix whose entries are independent mean zero subgaussian random variables with parameter $\beta$. Assume that $N = \lambda n$ for some $\lambda \ge 2$. Then, with probability at least $1 - \lambda^{-n}$, the random matrix $U = \frac{1}{\sqrt{N}}\Phi$ satisfies the uncertainty principle with parameters

$$\eta = C\beta\sqrt{\frac{\log\lambda}{\lambda}}, \qquad \delta = \frac{c}{\lambda}, \tag{IV.8}$$

where $C, c > 0$ are absolute constants.
Remark. Theorem 4.6 and Lemma 4.8 below can be deduced from the recent works [32], [33]. However, we feel that it would be helpful to include short and rather standard proofs of these results here.

Theorem 4.6 follows easily from an estimate on the operator norm of a subgaussian matrix.

Lemma 4.7: ([30] Fact 2.4) Let $n \ge k$ and $\Phi$ be a $n \times k$ matrix whose entries are independent mean zero subgaussian random variables with parameter $\beta$. Then

$$\mathbb{P}\{\|\Phi\| > t\sqrt{n}\} \le \exp(-c_1 n t^2/\beta^2) \tag{IV.9}$$

for all $t \ge C_1\beta$; here $C_1, c_1 > 0$ are absolute constants.

Proof of Theorem 4.6. The uncertainty principle for the matrix $U$ with parameters $\eta, \delta$ is equivalent to the following norm estimate:

$$\sup_{|I| = \lceil\delta N\rceil}\|\Phi_I\| \le \eta\sqrt{N},$$

where the supremum is over all subsets $I \subseteq \{1, \ldots, N\}$ of cardinality $\lceil\delta N\rceil$, and where $\Phi_I$ denotes the submatrix of $\Phi$ obtained by selecting the columns in $I$.

Without loss of generality, $c < 1$. Since $\Phi_I$ is a $n \times \lceil\delta N\rceil$ matrix and $c_2 n \le \lceil\delta N\rceil \le n$, Lemma 4.7 applies for $\Phi_I$. Taking the union bound over all $I$, we conclude that for every $t \ge C_1\beta$

$$\mathbb{P}\{\exists I : \|\Phi_I\| > t\sqrt{n}\} \le \binom{N}{\lceil\delta N\rceil}\exp(-c_1 n t^2/\beta^2) \le \exp\big[(\log(e/\delta) - c_1 t^2/\beta^2)n\big] \le \exp(-c_3 n t^2/\beta^2)$$

if we choose $t = C\beta\sqrt{\log\lambda}$ and use our choice of $\delta = c/\lambda$. (Here $c_3 = c_1/2$ and $C$ are absolute constants.) With this choice of $t$, we can write the estimate above as

$$\mathbb{P}\Big\{\exists I : \|\Phi_I\| > C\beta\sqrt{\frac{\log\lambda}{\lambda}}\sqrt{N}\Big\} \le \exp(-c_3 C^2 n\log\lambda) \le \lambda^{-n},$$

provided we choose the absolute constant $C$ sufficiently big. This means that the uncertainty principle with parameters (IV.8) fails with probability at most $\lambda^{-n}$. ■

Unlike random orthogonal or partial Fourier matrices considered in Sections IV-B and IV-C, subgaussian matrices do not in general have orthonormal rows. Nevertheless, the rows of subgaussian matrices are almost orthogonal, and their columns form almost tight frames as we describe below. So, one can use Theorem 3.9 instead of Theorem 3.5 to compute Kashin's representations for such almost tight frames.

The almost orthogonality of subgaussian matrices can be expressed as follows:

Lemma 4.8: Let $\Phi$ be a $n \times N$ matrix whose entries are independent mean zero subgaussian random variables with parameter $\beta$ and with variance 1. There exist constants $C = C(\beta)$, $c = c(\beta) > 0$ such that the following holds. Assume that

$$N \ge C\,\frac{n}{\varepsilon^2}\log\frac{2}{\varepsilon}$$

for some $\varepsilon \in (0, 1)$. Then

$$\mathbb{P}\Big\{\Big\|\mathrm{id} - \frac{1}{N}\Phi\Phi^*\Big\| > \varepsilon\Big\} \le 2\exp(-cN\varepsilon^2).$$

Remark. The dependence on $\beta$ in $C(\beta)$, $c(\beta)$ is polynomial. Explicit bounds can be deduced from [33].

As a straightforward consequence, we obtain:

Corollary 4.9: (Subgaussian frames are almost tight) Let $\Phi$ be a subgaussian matrix as in Lemma 4.8. Then the columns of the matrix $\frac{1}{\sqrt{N}}\Phi$ form an $\varepsilon$-tight frame $(u_i)_{i=1}^N$ in $\mathbb{C}^n$.

Proof of Lemma 4.8. In this proof, $C_1, C_2, c_1, c_2, \ldots$ will denote positive absolute constants. By a duality argument as in (IV.6),

$$\Big\|\mathrm{id} - \frac{1}{N}\Phi\Phi^*\Big\| = \sup_{x \in \mathbb{C}^n,\ \|x\|_2 = 1}\Big|\frac{1}{N}\|\Phi^* x\|_2^2 - 1\Big|.$$

Denote the columns of $\Phi$ by $\varphi_i$. Fix a vector $x \in \mathbb{C}^n$, $\|x\|_2 = 1$. Since the entries of the vector $\varphi_i$ are subgaussian with parameter $\beta$, the random variable $\langle\varphi_i, x\rangle$ is also subgaussian with parameter $C_1\beta$, where $C_1$ is an absolute constant (see Fact 2.1 in [30]). Moreover, this random variable has mean zero and variance 1. We can use Bernstein's inequality (see [39]) to control the average of the independent mean zero random variables $|\langle\varphi_i, x\rangle|^2 - 1$ as

$$\mathbb{P}\Big\{\Big|\frac{1}{N}\|\Phi^* x\|_2^2 - 1\Big| > u\Big\} = \mathbb{P}\Big\{\Big|\frac{1}{N}\sum_{i=1}^N\big(|\langle\varphi_i, x\rangle|^2 - 1\big)\Big| > u\Big\} \le 2\exp(-c_1 N u^2/\beta^4)$$

for all $u \le c\beta$, where $c_1 > 0$ is an absolute constant.

Denote $U = \frac{1}{\sqrt{N}}\Phi$. There exists a $u$-net $\mathcal{N}$ of the sphere $S^{n-1}$ in the Euclidean norm, and with cardinality $|\mathcal{N}| \le (3/u)^n$ (see e.g. [34] Lemma 4.16). Using the probability estimate above, we can take the union bound to estimate the probability of the event

$$A := \big\{\forall y \in \mathcal{N} : \big|\|U^* y\|_2^2 - 1\big| \le u\big\}$$

as

$$\mathbb{P}(A^c) \le (3/u)^n \cdot 2\exp(-c_1 N u^2/\beta^4).$$

Applying Lemma 4.7 with $t = C_1\beta$, we see that the event $B := \{\|U^*\| \le C_1\beta\}$ satisfies $\mathbb{P}(B^c) \le \exp(-c_2 N)$. Consider a realization of the random variables for which the event $A \cap B$ holds. For every $x \in S^{n-1}$, we can find an element of the net $y \in \mathcal{N}$ such that $\|x - y\|_2 \le u$, which implies by the triangle inequality that

$$\big|\|U^* x\|_2 - 1\big| \le \big|\|U^* y\|_2 - 1\big| + \big|\|U^* x\|_2 - \|U^* y\|_2\big| \le \big|\|U^* y\|_2 - 1\big| + \|U^*(x - y)\|_2 \le u + 2C_1\beta u \le C_2\beta u,$$

where $C_2 = 1 + 2C_1$. Now let $u = \varepsilon/(3C_2\beta)$. Thus $C_2\beta u = \varepsilon/3 \in (0, 1)$, and the estimate above yields $\big|\|U^* x\|_2^2 - 1\big| \le \varepsilon$ for all $x \in S^{n-1}$ once the event $A \cap B$ holds. Thus

$$\mathbb{P}\Big\{\Big\|\mathrm{id} - \frac{1}{N}\Phi\Phi^*\Big\| > \varepsilon\Big\} \le \mathbb{P}\big\{\exists x \in S^{n-1} : \big|\|U^* x\|_2^2 - 1\big| > \varepsilon\big\} \le \mathbb{P}\big((A \cap B)^c\big) \le (3/u)^n \cdot 2\exp(-c_1 N u^2/\beta^4) + \exp(-c_2 N) \le 2\exp(-cN\varepsilon^2)$$

by our choice of $u$ and by the assumption on $N$. ■
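The near-tightness expressed by Lemma 4.8 can be observed on a small example. The following sketch, assuming real standard Gaussian entries, computes the Frobenius norm of $\mathrm{id} - \frac{1}{N}\Phi\Phi^*$, which is an upper bound for the operator norm in the lemma; the function name `tightness_defect` and the dimensions are our own choices.

```python
import math
import random


def tightness_defect(n, N, rng):
    """Frobenius norm of id - (1/N) Phi Phi^T for an n x N standard Gaussian Phi.

    The Frobenius norm dominates the operator norm used in Lemma 4.8.
    """
    phi = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            gram = sum(a * b for a, b in zip(phi[i], phi[j])) / N
            total += (gram - (1.0 if i == j else 0.0)) ** 2
    return math.sqrt(total)


rng = random.Random(0)
for N in (50, 500, 5000):
    # The defect is expected to shrink roughly like n / sqrt(N).
    print(N, tightness_defect(5, N, rng))
```

The experiment only illustrates the trend as $N$ grows for fixed $n$; it does not recover the constants $C(\beta)$ and $c(\beta)$ of the lemma.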